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DETAILED ACTION 

1 . Elected Claims 1 -1 6 are rejected in this Office Action. 

Claim Rejections - 35 USC §112 

2. The following is a quotation of the second paragraph of 35 U.S.C. 112: 

The specification shall conclude with one or more claims particularly pointing out and distinctly 
claiming the subject matter, which the applicant regards as his invention. 

Claim 1 recites the limitation "... the cache behavior ..." in the preamble. 
There is insufficient antecedent basis for this limitation in the claim. 

Claim 12 recites the limitation "... the cache behavior ..." in the preamble. 
There is insufficient antecedent basis for this limitation in the claim. 

Claim Rejections - 35 USC § 101 

3. 35 U.S.C. 101 reads as follows: 

Whoever invents or discovers any new and useful process, machine, manufacture, or 
composition of matter, or any new and useful improvement thereof, may obtain a patent 
therefor, subject to the conditions and requirements of this title. 

Claims 1 and 12 are rejected under 35 U.S.C. 101 because the claimed 
invention is directed to non-statutory subject matter. 

Claim 1 recites a method of improving the cache behavior comprising the 
steps of "associating with each node ... node"; "representing ... MBR"; and 
"compressing ... quantization". The steps are broadly recited without regard to 
any tangible way of implementing them that they are directed to the abstract 
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idea. The abstract idea comprising the steps are not instantiated into some 
specific physical implementation which would result in a practical application 
producing a concrete, useful, and tangible result to form the basis of statutory 
subject matter under 35 U.S.C. 1 01 . 

Claim 12 recites a of improving the cache behavior comprising the steps 
of "associating with each node ... node"; "representing ... shape"; and 
"compressing ... quantization". The steps are broadly recited without regard to 
any tangible way of implementing them that they are directed to the abstract 
idea. The abstract idea comprising the steps are not instantiated into some 
specific physical implementation which would result in a practical application 
producing a concrete, useful, and tangible result to form the basis of statutory 
subject matter under 35 U.S.C. 101. 

Claim Rejections - 35 USC § 102 

4. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(e) the invention was described in (1) an application for patent, published under section 
122(b), by another filed in the United States before the invention by the applicant for patent or 
(2) a patent granted on an application for patent by another filed in the United States before 
the invention by the applicant for patent, except that an international application filed under 
the treaty defined in section 351(a) shall have the effects for purposes of this subsection of an 
application filed in the United States only if the international application designated the United 
States and was published under Article 21 (2) of such treaty in the English language. 

Claims 1-2, 6-9, and 11 are rejected under 35 U.S.C. 102(e) as being 
anticipated by U.S. Patent No. 6,470,344 issued to Kothuri et al. ("Kothrui"). 



Application/Control Number: 10/087,360 Page 4 

Art Unit: 2163 

As to Claim 1, Kothuri teaches "a method of improving the cache behavior 
of accessing a multidimensional index structure resident in main memory for 
facilitating reference to data objects stored in a database, where the index 
structure consists of internal nodes having pointers to child nodes and leaf nodes 
having to database objects, the method comprising the steps of: 

• associating with each node a minimum bounding rectangle ("MBR"), 
wherein each MBR is the minimal hyper-rectangle enclosing the 
corresponding data object in the case of a leaf node and all the hyper- 
rectangles in the child node in the case of an internal node" (Figures 1A 
and 7 illustrate associating with each node a minimum bounding 
rectangle, column 9, lines 1-21); 

• Kothuri teaches "representing each of one or more said MBRs by a 
relative representation of an MBR ("RMBR") that is the coordinates of the 
MBR represented relative to the coordinates of a reference MBR" (Figures 
3 and 6A display the relative representation of a minimum bounding 
rectangle, column 11, lines 47-53); and 

• "compressing each RMBRs into a quantized, RMBR ("QRMBR") by 
quantizing each RMBR to finite precision by cutting off trailing insignificant 
bits after quantization" (column 11, lines 60-67 and column 12, lines 1-62 
wherein illustrates an equivalent process of cutting off trailing insignificant 
bits after quantization using truncate or rounded down procedure). 
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As to Claim 2, Kothuri teaches "wherein said multi-dimensional index 
structure is an R-tree" (Figure 4, illustrates the R-tree, column 12, lines 55-56). 

As to Claim 6, Kothuri teaches "wherein each internal node has a plurality 
of entries where the first entry has a QRMBR and a pointer while the rest of the 
entries have only QRMBRs" (column 12, lines 49-54). 

As to Claim 7, Kothuri teaches "wherein each node stores a reference 
MBR" (Figures 1A and 1B illustrate each node stores a reference MBR). 

As to Claim 8, Kothuri teaches "wherein the reference MBR of a node is 
obtained from the corresponding QRMBR stored in the node's parent node" 
(column 12, lines 49-54). 

As to Claim 9, Kothuri teaches "wherein the internal nodes store QRMBRs 
while the leaf nodes store MBRs" (column 9, lines 10-21). 

As to Claim 11, Kothuri teaches "wherein said database resides in disk" 
(column 6, lines 14-16). 

Claims 12-14 and 16 are rejected under 35 U.S.C. 102(e) as being 
anticipated by U.S. Patent No. 6,470,344 issued to Kothuri et al. ("Kothrui"). 

As to Claim 12, Kothuri teaches "a method of improving the cache 
behavior of accessing a multidimensional index structure resident in main 
memory for facilitating reference to data objects stored in a database, where the 
index structure consists of internal nodes having pointers to child nodes and leaf 
nodes having to database objects, the method comprising the steps of: 
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• associating with each node a minimum bounding shape, a multi- 
dimensional shape enclosing the corresponding data object in the case of 
a leaf node and all the minimum bounding shapes in the child node in the 
case of an internal node" (Figure 1, column 8, lines 19-39); 

• "representing each of one or more said minimum bounding shape by a 
relative representation that is the coordinates of the minimum bounding 
shape represented relative to the coordinates of a reference minimum 
bounding shape" (Figures 3 and 6A display the relative representation of a 
minimum bounding shape wherein the shape is equivalent to the 
illustrated rectangle, column 1 1 , lines 47-53); and 

• "compressing each relative representation into a quantized representation 
by quantizing each relative representation to finite precision by cutting off 
trailing insignificant bits after quantization" (column 11, lines 60-67 and 
column 12, lines 1-62 wherein illustrates an equivalent process of cutting 
off trailing insignificant bits after quantization using truncate or rounded 
down procedure). 

As to Claim 13, Kothuri teaches "wherein each internal node has a 
plurality of entries where the first entry has a quantized representation and a 
pointer while the rest of the entries have only quantized representations" (column 
12, lines 49-54). 
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As to Claim 14, Kothuri teaches "wherein the reference minimum 
bounding shape of a node is obtained from the corresponding quantized 
representation stored in the node's parent node" (column 12, lines 49-54). 

As to Claim 16, Kothuri teaches "wherein said database resides in disk" 
(column 6, lines 14-16). 

Claim Rejections - 35 USC § 103 

5. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described 
as set forth in section 102 of this title, if the differences between the subject matter sought to 
be patented and the prior art are such that the subject matter as a whole would have been 
obvious at the time the invention was made to a person having ordinary skill in the art to which 
said subject matter pertains. Patentability shall not be negatived by the manner in which the 
invention was made. 

Claims 3-5 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over U.S. Patent No. 6,470,344 issued to Kothuri et al. ("Kothrui") as applied to 
claim 1 above, and further in view of U.S. Patent No. 6,868,410 issued to Fortin 
et al. ("Fortin"). 

As to Claim 3: 

Kothuri teaches the elements of Claim 1 as stated above. 

Kothuri does not explicitly teach "wherein said multi-dimensional index 
structure is an RMree". 

Fortin teaches "wherein said multi-dimensional index structure is an R*- 
tree" (column 8, lines 31-65). 
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It would have been obvious to a person of ordinary skill in the art at the 
time of the invention to combine the teaching of Fortin with the teaching of 
Kothuri in order to provide a method or system which integrates multi- 
dimensional index structure such as an RMree, thereby providing a method or 
system which further improved storage or memory utilization and robustness in 
processing data distribution. 

As to Claim 4: 

Kothuri teaches the elements of Claim 1 as stated above. 

Kothuri does not explicitly teach "wherein said multi-dimensional index 
structure is an R+-tree". 

Fortin teaches "wherein said multi-dimensional index structure is an R+- 
tree" (column 8, lines 31 -65). 

It would have been obvious to a person of ordinary skill in the art at the 
time of the invention to combine the teaching of Fortin with the teaching of 
Kothuri in order to provide a method or system which integrates multi- 
dimensional index structure such as an R+-tree, thereby providing a method or 
system which further reduced overlap of minimum bounding rectangles. 

As to Claim 5: 

Kothuri teaches the elements of Claim 1 as stated above. 



Application/Control Number: 10/087,360 Page 9 

Art Unit: 2163 

Kothuri does not explicitly teach "wherein said multi-dimensional index 
structure is a Hilbert R-tree". 

Fortin teaches "wherein said multi-dimensional index structure is a Hilbert 
R-tree" (column 8, lines 31-65). 

It would have been obvious to a person of ordinary skill in the art at the 
time of the invention to combine the teaching of Fortin with the teaching of 
Kothuri in order to provide a method or system which integrates multi- 
dimensional index structure such as a Hilbert R-tree, thereby providing a method 
or system which further improved storage or memory utilization. 

Claim 10 is rejected under 35 U.S.C. 103(a) as being unpatentable over 
U.S. Patent No. 6,470,344 issued to Kothuri et al. ("Kothrui") as applied to claim 
1 above, and further in view of "Compacting Discriminator Information for Spatial 
Trees by Inga Sitzmann and Peter J. Stuckey, Copyright 2001, Australian 
Computer Society, Inc. 

As to Claim 10: 

Kothuri teaches the elements of Claim 1 as stated above. 
Kothuri does not explicitly teach "wherein said database resides in main 
memory". 

Sitzmann and Stuckey teach "wherein said database resides in main 
memory" (Abstract, page 167). 
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It would have been obvious to a person of ordinary skill in the art at the 
time of the invention to combine the teachings of Sitzmann and Stuckey with the 
teaching of Kothuri in order to provide a method or system wherein database will 
fit entirely in main memory (Sitzmann and Stuckey, Introduction). 

Claim 15 is rejected under 35 U.S.C. 103(a) as being unpatentable over 
U.S. Patent No. 6,470,344 issued to Kothuri et al. ("Kothrui") as applied to claim 
12 above, and further in view of "Compacting Discriminator Information for 
Spatial Trees by Inga Sitzmann and Peter J. Stuckey, Copyright 2001 , Australian 
Computer Society, Inc. 

As to Claim 15: 

Kothuri teaches the elements of Claim 12 as stated above. 
Kothuri does not explicitly teach "wherein said database resides in main 
memory". 

Sitzmann and Stuckey teach "wherein said database resides in main 
memory" (Abstract, page 167). 

It would have been obvious to a person of ordinary skill in the art at the 
time of the invention to combine the teachings of Sitzmann and Stuckey with the 
teaching of Kothuri in order to provide a method or system wherein database will 
fit entirely in main memory (Sitzmann and Stuckey, Introduction). 
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Any inquiry concerning this communication or earlier communications from 
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If attempts to reach the examiner by telephone are unsuccessful, the 
examiner's supervisor, Safet Metjahic can be reached on 571-272-4023. The fax 
phone number for the organization where this application or proceeding is 
assigned is 571-273-4033. 

Information regarding the status of an application may be obtained from 
the Patent Application Information Retrieval (PAIR) system. Status information 
for published applications may be obtained from either Private PAIR or Public 
PAIR. Status information for unpublished applications is available through 
Private PAIR only. For more information about the PAIR system, see http://pair- 
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free). 
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Abstract 

Cache-conscious behaviour of data structures becomes more 
important as memory sizes increase and whole d^tabaseii fit 
into mam memory For spatial data, R-trees, originally de- 
signetr^r^^pDSed data, can be adopted for in-memory ap- 
plications. In this paper, we will investigate how the small 
amount of space in an in-memory R- tree node can be used bet- 
ter to make R-trees more cache-conscious. We observe that 
many entries share sides with their parents, and introduce the 
partial R-tree which only stores information that is not given 
by the parent node. Our experiments showed that the partial 
R-tree shows up to 30 per cent better performance than the 
traditional R-tree. We also investigated if we could improve 
the search performance by storing more descriptive informa- 
tion instead of the standard minimum bounding box without 
decreasing the fanout of the R-tree. The partial static O-tree 
is based on the O-tree, but stores only the most important part 
of the information of an O-tree box. Experiments showed that 
this approach reduces the search time for line data by up to 60 
per cent. 

Keywords: Index structures, spatial databases, in- 
memory databases. 

1 Introduction 

Latest surveys have shown that the availability of 
ch eap memory will lead to computer systems with 
main memory sizes in the order of terrabytes ove r 
thenext 10 years | Bernstein et al.^ jggg] ^ Many data- 
bftse^ will tfeen fit entirejy^ For data- 

base tecnnblogy, this means'thai the fractional bot- 
tleneck of memory-disk latency will be replaced by the 
cpu-memory latency as the crucial factor for database 
performance. It is therefore increasingly important 
for database index structures and algorithms to be- 
come sensitive to the cache behaviour. 

Recent research on index structures includes sev- 
eral papers addressing cache-sensitive index struc- 
tures. In [Rao and Ross, 2000], the authors propose 
a pointer elimination technique for B+- trees. JJpties,, 
qf the CSB^tree are^ 

fore 6nly ; the vppinter to the first child node neecls 

Copyright ©2001, Australian Computer Society, Inc. This pa- 
per appeared at the Thirteenth Australasian Database Confer- 
ence (ADC2002), Melbourne, Australia. Conferences in Research 
and Practice in Information Technology, Vol. 5. Xiaofang Zhou, 
Ed. Reproduction for academic, not-for profit purposes permit- 
ted provided this text is included. 



t9 be stored in the parent. This pointer elimina- 
tion technique was extended to spatial data struc- 
tures in the paper [Ross et al., 2001] which intro- 
duced cost-based unbalanced R-trees (CUR-trees). 
The cost factors of the cache behaviour of a given 
architecture can be modeled into a cost function 
and a CUR-tree is built which is optimized with re- 
spect to the cost function and a given query model. 
Prefetch instructions in combination with multiple- 
size nodes can further assist to achieve better cache 
behaviour [Chen et al., 2001]. Other methods im- 
prove the space utilization by compressing the entries 
Mlrt^ [Kim et al., 2001]. 

The values describing the discriminator or minimum 
bounding box (MBB) in R-trees are represented rel- 
atively to its parent and the number of bits per value 
is reduced by mapping the values to a coarser rep- 
resentation. The problem of cache-sensitivity of data 
structures with complex keys is addressed in the pa- 
per [Bohannon et al., 2001] which uses partial keys of 
fixed size. 

In this paper, we will investigate how we can make 
better use of in-memory spatial tree nodes by elim- 
inating unnecessary information in the discriminator 
and storing more descriptive information than the 
minimum bounding boxes traditionally used to ap- 
proximate the spatial objects. 

In-memory R-trees typically have very small node 
sizes, between 3 and 7 entries, because the natural 
node size is determined by the size of a cacheline. We 
observed that for such R-trees a substantial part of 
the information is stored multiple times at different 
levels in the tree. Very often, entries in the nodes 
share at least one of their sides with the enclosing 
parent bounding box. 

We propose the use of partial information in R- 
tree nodes and show how this can improve space util- 
ization and performance of R-trees. We introduce the 
concept of partial R-trees where information is handed 
down from parent to children to eliminate multiple 
storing of values. Only values which add information 
about the entry that is not already given in the parent 
are stored in the node. 

We also investigate how more descriptive informa- 
tion than the standard MBB can be stored while not 
using up additional space compared to the standard 
R-tree. The partial static O-tree is based on the O- 
tree [Sitzmann and Stuckey, 2000], but stores at most 
four values only, thus not increasing the fanout of the 
tree while providing better discriminator information 
and thus improving the search. 



The structure of the paper is as follows. Section 2 
briefly describes R-trees and talks about their in- 
memory use. We present partial R-trees in Section 3, 
and describe how redundant information can be elim- 
inated and how this affects insertion and search. Sec- 
tion 4 describes how to store more descriptive inform- 
ation about the entries in partial O-trees. We present 
experimental results in Section 5 before concluding 
our paper in Section 6 with a summary and an out- 
look to future work in Section 7. 



1 H c2 1| c3 | 



sO 


si 


52 


S3 



Figure 1: Memory layout of a CSB+-R-tree node 



2 Using R-trees in main memory 

2.1 The R-tree structure 

The R-tree and its variants [Guttman, 1984, 
Beckmann et al., 1990] is a data structure for n- 
dimensional data. It has originally been designed to 
index disk-based data. 

An R-tree node t has a number of subtrees £.n, 
and for each 1 < i < t.n a discriminator or minimum 
bounding box (MBB) t[i].d (which is an array of four 
side values: left, right, bottom, top), and a pointer 
t[i].t. The pointer points at an object identifier if the 
node is a leaf node. Otherwise it points at another R- 
tree node. R-trees are built with a maximum number 
of entries per node M and a minimum number of 
entries per node m < M/2. 

Further rules which determine the shape of the R- 
tree are: 

1. All leaf nodes appear on the same level. 

2. Every node which is not the root node contains 
between m and M entries. 

3. The root node has at least two entries unless it 
is a leaf node. 

If insertion of an entry in a node results in an 
overfull node, the node is split. Splitting al- 
gorithms with linear and quadratic complexity have 
been presented in the literature [Guttman, 1984, 
Beckmann et al., 1990]. 

2.2 Differences between disk and in-memory 
- R-trees 

The concept of the R-tree to organize spatial objects 
in a balanced search tree by using minimum bound- 
ing boxes (MBBs) as discriminators works well in the 
disk-based context. In this context, the I/O cost of 
R-trees is directly dependent on the number of block 
accesses, and is very competitive compared to other 
spatial access methods. This approach can also be 
transfered to in-memory use of R-trees and works sim- 
ilarly well with respect to the cache-behaviour of R- 
trees, where performance depends on the number of 
cache misses occurring. Thus, the strength of the R- 
tree concept is the same for disk-based and in-memory 
use. 

Looking at the size of the R-tree nodes, we ob- 
serve that, as expected, the number of entries per 
node in the in-memory case is significantly smaller. 
In the disk-based case with a page size of 4 KBytes, 
we can fit 170 entries into one node of a regular R- 
tree (assuming 4 bytes per number) and 255 if we use 
the CSB+-technique [Rao and Ross, 2000] to elimin- 
ate pointers. For in-memory R-trees using a cacheline 



of 64 bytes, a normal R-tree node using 4 bytes per 
number can only store 3 entries. The CSB+-pointer 
dimination technique does not increase the fanout of 
the tree in this case. Assuming 2 bytes per number, 
we can fit 5 entries, and with the CSB-h-technique, 7 
entries per node. 

Figure 1 shows the memory layout of a CSB+-R- 
tree node. A node consists of the pointer p to the 
contiguous array of child nodes, a counter n for the 
number of entries and entries ei to e„. Each entry e< 
consists of four sides s 0 to S3. As the child nodes of a 
CSB+-R-tree node t are stored contiguously, we can 
refer to the i th child node of t as t.p + (t - 1) (using 
C style pointer addition), using the base pointer t.p 
and adding the offset t. In the conventional R-tree 
notation, this corresponds to t[t]-t v the i th child node 
of a conventional R-tree node. 

Analyzing the characteristics of small R-tree 
nodes, we can make some observations which are quite 
different to the large R-tree nodes used for disk-based 
data: 

• the discriminator MBBs are usually very small 
and entries share sides with the enclosing MBB 

• increasing the fanout of small nodes has an 
immediate effect on the height of the tree 
(r%s(10 e )l = 13 and \log 4 (W)] = 10) 

We investigate whether we can improve performance 
of in-memory R-trees by making use of the two prop- 
erties stated above. 

3 Eliminating multiply stored information 

3.1 Analyzing R-trees with small nodes 

The idea for partial R-trees was motivated by a small 
experiment. Using simple datasets of lines and poly- 
gons (see Section 5), we counted how many discrimin- 
ators in an R-tree share one or more sides with their 
parents when using nodes of small sizes. The results 
in Table 1 are for an R-tree with 3 entries per node. 



File 


Sides shared 


with parent (i 


n%) 


Avg. 
sides 


1 


2 


3 


4 


0 


lioooo 


30.14 


49.50 


13.38 


ti.ol 


6.80 


2.30 


150000 


31.20 


48.03 


14.47 


0.01 


6.15 


2.28 


1100000 


30.93 


48.35 


14.55 


0.01 


6.03 


2.28 


plOOOO 


27.35 


29.67 


18.72 


7.40 


16.79 


2.27 


pSOOOO 


27.74 


28.80 


10.09 


7.73 


16.61 


2.26 


plOOOOO 


27.79 


20.04 


19.12 


7.75 


16.27 


2.25 



Table 1: MBB sides shared with parent MBB 
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On average, only 2.25 to 2.3 sides of a discrimin- 
ator have to be stored per entry as shown in the last 
column. For all files, at least 83 per cent of discrim- 
inators share at least one side with their parent. We 
will propose a method where the information from the 
parent discriminator is used to construct the complete 
box and show that this increases the fanout of the R- 
tree nodes, resulting in better search performance. 

3.2 A more compact R-tree representation 

We propose storing only those sides of a discriminator 
which the entry does not have in common with its 
parent. Consider the objects depicted in Figure 2. 



Ol 










02 


1 [5] 



Figure 2: Objects in an R-tree node 



The dashed line represents the minimum bounding 
box of a node that contains the three objects Ol, 02, 
and 03. As 01 shares the left and top side with the 
parent, we only need to store the right and bottom 
side. For 02, only the left side needs to be stored as 
the other three sides are given by the parent discrim- 
inator. All four sides for 03 must be stored in the 
node as the entry shares no side with the parent. 

The structure of an R-tree node with partial in- 
formation will therefore be more dynamic than the 
traditional R-tree node structure. Instead of storing 
four sides for every discriminator, we store between 
0 and 4 sides per entry. The structure is shown in 
Figure 3. 




Figure 3: The node structure of a partial R-tree 



The partial R-tree node contains a pointer and a 
counter for the number of entries in the node. The 
first part of the node then contains a sequence of 4- 
bit fields. The bitfields correspond to the entries in 
ascending order. Each bitfield indicates which sides 
are stored in the node and which sides have to be 
inherited from the parent. The actual sides are stored 
in reverse order starting from the end of the node. 
This allows us to make full use of the space of the 
R-tree node and to minimize copy and comparison 
operations during insertion and search. 



As Table 1 shows, we only store about 2.25 sides 
per discriminator. Assuming that a side is 4 bytes 
large, we save 1.75x4 = 7 bytes, which corresponds to 
56 bits. Using 4 bits per entry for extra information, 
we still save 52 bits per entry. 

A partial R-tree node 8 has a base pointer s.p, a 
number of entries s.n, and for each 1 < t < s.n a 
4-bit array s[i].b indicating which sides differ from 
the parent discriminator. The remainder of the node 
is an array of side values s.s where s.s\j] is the j th 
side value in the node (corresponding to the side of 
the j th true bit occurring in the bit fields). Here we 
ignore the fact that this array is stored in reverse 
order. We assume s.l gives the number of sides stored 
in the partial R-tree node. Note, this is not actually 
stored in the node, since it can be calculated from the 
number of true bits in the 4-bit fields. 

3.3 Creating the partial representation 

Given an R-tree node t with parent discriminator pd, 
we can create a corresponding partial R-tree node s 
by mapping each entry in t to its partial information 
using the algorithm ToPartial. 



ToPartial(*,pd) 
s.p := t.p 
s.n := t.n 
s.l := 0 

for i := 1 to t.n 
for j := 0 to 3 

» ip<m = mm) 

else 

s[i].J>[j] := true 
s.s[s.l] := t\i).d[j] 
s.l := s.l + 1 
return s 



For each discriminator t[i].d occurring in the R- 
tree node we check each side j versus the parent dis- 
criminator. If it is different to the corresponding side 
in the parent, we set the corresponding bit s[i].6(j] 
and store the side in the next available space in the 
array of sides s.s. The counter s.l keeps track of how 
many sides are stored in s. 

3.4 Extracting information from a partial R- 
tree 

To read a discriminator in a partial R-tree node, we 
need to combine the information from the parent dis- 
criminator with the information stored in the node. 
The algorithm ToComplete converts a partial R-tree 
node s together with its parent discriminator pd to a 
(complete) R-tree node. 



ToComplete(s,pd) 
t.p := s.p\ t.n := s.n 
I :=0 

for * := 1 to s.n 
t\i].d :=pd 
for j := 0 to 3 



if («M.6H) 

mi? 

return t 



s[l) 



After appropriately setting the number of sub- 
trees, the count of sides is initialized, and then each 
bit field is examined in turn. The discriminator is 
initialized to the parent discriminator and then, for 
each different side (marked by a true bit), the next 
side value is copied in. 

The R-tree node t, returned from ToCompIete for 
partial R-tree node s, is identical to the R-tree node 
that is given to ToPartial to construct s. No inform- 
ation is lost. It is important to note that for search 
operations we do not need to restore the entries com- 
pletely. We can rather use the partial information to 
perform search as we will describe below. 

3.5 Building partial R-trees 

The insertion procedure of an object in a partial R- 
tree requires some extra steps in comparison to the 
standard R-tree insertion. The algorithm Insert de- 
scribes the insertion of an object o with discriminator 
e into a partial R-tree s with discriminator pd. The 
algorithm effectively first converts each partial R-tree 
node visited to its corresponding (complete) R-tree 
node, performs the insertion and then converts back 
to a partial R-tree node. 



When inserting an object into an external node, 
we convert the node to a complete R-tree node and 
simply add the entry naively. The new parent dis- 
criminator is pd U e, that is the minimal bounding 
box that includes both pd and e. We then convert 
the expanded (completed) R-tree node back to a par- 
tial R-tree node. 

As the representation of the node depends on the 
parent discriminator, the representation of entries 
other than the new entry may have changed. For 
example, an entry previously sharing two sides with 
the parent might now only share one side with the 
new, larger parent discriminator. The function Full 
tests whether the new representation is too large to 



fit into the bits available in a node. A partial R-tree 
node s is full if s.n x 4 + $.1 x sidebits is too great 
to fit in the node (together with the bits for s.n and 
the pointer s.p). The number sidebits refers to the 
number of bits used to store a side (we use 32 bits). 

If the new representation of the node is too large 
for the node, either because there was no space for 
the new entry, or its addition caused other entries to 
no longer fit, the node is then split. We use the quad- 
ratic splitting algorithms of [Beckmann et al., 1990]. 
The Split algorithm splits the node s into two nodes 
containing the same entries that will each fit in the 
available space. The changes are then passed back to 
the parent node. 

For internal nodes, we convert to a (complete) R- 
node and select the best subtree for insertion. We 
choose the subtree whose discriminator shows the 
smallest increase in area after insertion of the new 
entry. Insertion is then continued in the selected sub- 
tree. The results of the insertion on the lower level 
may be one or two subtrees. These are added to the 
R-tree node, which is then converted back to a par- 
tial R-tree node. Note that we need to recalculate the 
representation of the node even when insertion on a 
lower level returns one subtree since the parent dis- 
criminator may have changed. This means, that we 
might have to split a node on an internal level, even 
if no new entry was added to it by insertion lower in 
the tree. We check if a split is necessary after the 
new representation has been determined and propag- 
ate the changes to the parent level. 

The algorithms shown convert partial R-tree nodes 
to R-tree nodes and back again for most processing, 
for ease of explanation. In the implementation most 
conversion of discriminators is avoided. 

3.6 Searching partial R-trees 

When searching a partial R-tree, we do not need to 
reconstruct the complete entry before we can determ- 
ine whether search should continue in a child tree or 
not. Instead, we only need to compare the stored 
sides of an entry with the query as we know that the 
inherited sides match the query. 

This is clarified by Figure 4. Imagine we have de- 
termined that query box q intersects with the parent 
discriminator pd of some entry with discriminator e. 
In order to determine this we have checked that the 
box q does not lie completely above pd, to the left 
of pd % etc. These comparisons are represented by the 
four dashed arrows. In order to determine that q in- 
tersects e we do not need to consider whether it lies 
completely below e or to the right since these com- 
parisons were already made with the parent box pd. 
Hence the only two comparisons required are with re- 
spect to the sides of e not shared with the parent. 

We search an R-tree node by simultaneously scan- 
ning through the bitfield at the start of the node and 
the entries stored at the back of the node. For each 
entry in an internal node, we check whether the stored 
sides clash with the query. We use the bit fields to 
determine which sides stored in s.s refer to which 
discriminator sides. We know that all the sides not 
stored in the node are the same as in the parent dis- 
criminator and therefore match the query. 

We compare the opposite side opp(j) of the query 



lnsert(s,pd,e,o) 
case s 
external: 
t := ToComplete(s,pd) 
tn := tn + 1; t.p + tn := o; t[t.n].d := e 
s := ToPartialft, pdUe) 
if (Full(s)) 

(sl,sr y dl,dr) := Split(s,pdU e) 
else 

(s/, sr,d/,dr) := (s y null,pdU e, {}) 
return (s/,sr,d/,dr) 
internal: 
t := ToComplete(s, pd) 
i := ChooseSubtree(i,e) 
(sl,sr,dl,dr) := lnsert(s.p + i, s[i].dU e, e,o) 
%% replace t[i] by tl and tr 
Lp + i:= si; t[i).d := dl 
if (sr ^ null) 

t.n :— t.n -h 1; t.p -f tn sr; t[tn].d := dr 
s T6Partial(i, pd U e) 
if (Full(s)) 

(si,sr,dJ,<fr) := Split(s,pdUe) 
else 

(sl,sr,dl y dr) := (s,nuf/,pdU e, {}) 
return (sl } sr,dl,dr) 
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Figure 4: Reducing comparisons with partial R-trees 



with side j 9 in the appropriate direction dirn(j) (-1 
for right and top, 1 for left and bottom). If the entry 
matches the query, we continue search on the level 
below. Otherwise, we consider the next entry. For 
each entry in an external node, we compare all stored 
sides with the query and output the entry if it matches 
the query. 



The search algorithm shows that the redundancy 
in a normal R-tree not only occurs when storing sides, 
but also when performing comparisons. By using par- 
tial R-trees, we reduce the number of comparisons 
during search at the same time as we reduce the num- 
ber of sides stored in an entry. 

4 Storing better information 

In the partial R-tree, we try to improve the perform- 
ance of R-trees by eliminating redundant information, 
thus creating more room for other entries which res- 
ults in an increased fan-out of the node. Alternat- 
ively, we can try to improve the search behaviour 
of the tree by using the gained space to store more 
descriptive information than the standard minimum 
bounding box. This will help to filter out unsuccess- 
ful search paths at a higher level in the tree. In the 
paper [Sitzmann and Stuckey, 2000], we introduced 
the O-tree, a constraint-based data structure, which 
stores an orthogonal box in addition to the standard 
bounding box to give a better description of the ob- 
jects in the tree. 



Figure 5: Approximation of polygons in a R-tree and 
O-tree 



4.1 The O-tree approach 

The structure of the O-tree is similar to the R-tree. 
The difference is that we store two minimum bound- 
ing boxes per entry: the conventional MBB and an 
additional MBB along axes v and w which leave the 
origin at an angle of tt/4 to the x and y axes. The 
values of v and w can be obtained as v = (x 4- y)/y/2 
and w = (y - x)/y/2. Thus, an object in an O-tree is 
described by eight values, representing the lower and 
upper bounds on the four axes x, y y v and w. We can 
store an O-tree discriminator in an array of eight side 
values where the lower and upper bound of the x axis 
are stored in sides 0 and 1, sides 2 and 3 refer to the 
y axis, sides 4 and 5 to the v axis, and sides 6 and 7 
to the w axis. 

As shown in Figure 5, the standard MBB, depicted 
with solid lines, is a very poor representation for some 
kinds of data. Although, the two shaded polygons 
are far from intersecting each other, an intersection 
test based on the MBBs indicates an overlap. More 
information about the object is given if we describe 
the object with an additional orthogonal box (with 
dashed lines). Using the intersection of both boxes, 
the lack of overlap of the two polygons is clear. 

The O-tree representation is particularly useful for 
line data. Figure 6 compares the area of the discrim- 
inator in an R-tree and O-tree for line data. 




COS0 



Figure 6: Size of the O-tree bounding box and the 
R-tree bounding box for line data 



When storing a 2-d unit length line at an angle 0 < 
0 < 7r/8 to the horizontal, the area of the bounding 
box is cos(0) sin(0). In an O-tree, on the other hand, 
the area of the intersection of the bounding boxes is 
cos(0)sin(0) - sin 2 (0). This means that the O-tree 
region bounding a line is on average 2 - | « 0.43 
times the area of the R-tree minimum bounding box. 

In our experiments, we found that O-trees indeed 
improve the accuracy of the search significantly. But 
the disadvantage of O-trees also became apparent: as 
we store eight numbers per entry instead of four, the 
fanout of the tree is significantly reduced. Therefore, 
the overall search performance could only be slightly 
improved for line data intersection queries. 



Search(s,g) 
/:=0 

for i := 1 to s.n 
match := true 
for j := 0 to 3 

if (s[i\.b\j}) 
if (dirn(jf) x s.$[J++] > 
dirn(j) x g.d[opp(;)])) 
match := false; break 
if (match) 
case (s) 

external: Output(s.p + i) 
internal: Search(s.p + i, q) 

dirn(i) = l-2x(i mod 2) 

opp(i) = 2x(t-r2) + l-(t mod 2) 



The O-tree as first presented is impractical for in- 
memory use. For a typical 64 bytes cacheline, we can 
only fit 2 entries in a node. 

In this paper, we transfer the O-tree approach to 
in-memory data structures and try to overcome the 
weakness of the O-tree by storing only the four most 
descriptive sides of the O-tree and combine this in- 
formation with information obtained from the parent 
discriminator. 

4.2 A compact O-tree representation 

Although eliminating shared sides also reduces the 
number of sides per discriminator in an O-tree, on 
average, we still need to store more than four sides 
per discriminator. The fanout of the O-tree is there- 
fore still smaller than in the standard R-tree. We 
suggest applying another technique and eliminating 
those side of the O-tree discriminators that add no 
or little information about the objects they describe. 
The partial O-tree is based on the complete O-tree 
representation of a discriminator, but only the four 
most descriptive sides are selected for storage in the 
entry. Less than four sides can be selected if the dis- 
criminator shares more than four sides with its par- 
ent. The node structure is similar to the partial R- 
tree representation. 




Figure 7: The node structure of a partial O-tree 



The node contains a pointer and a counter. The 
bitfields form the first part of the remaining node, 
but in the partial O-tree case contain 8 bits each, 
indicating which sides of the entry are stored in the 
node. The sides are stored again in descending order 
at the back of the node. In most cases, we will store 4 
sides per entry, but for some nodes, more information 
will be shared with the parent and the number of 
sides can be further reduced. Compared to the R- 
tree, we have slightly more space overhead with one 
additional byte per entry used as the bit vector, but 
we have almost halved the space requirements of the 
complete O-tree. 

4.3 Selecting the most descriptive data 

A partial O-tree node is created by starting off with 
the complete O-tree node and for each entry re- 
peatedly discarding the least useful information until 
the representation contains at most 4 sides. The al- 
gorithm OToPartial is the equivalent for O-trees of 
ToPartial. 

We first eliminate the sides of t[i].d which are 
shared with its parent by setting their bits in s[i].b 
to false. As long as this bit field (considered as a 
set) contains more than 4 sides, we repeatedly call 



EliminateSide to delete the side with the least import- 
ance, i.e., information, from the set of sides. Once we 
have reduced the set to at most four sides, we store 
the sides in the node and set the bitfield. 

EliminateSide chooses the side to eliminate which 
causes the least increase in area of the discriminator. 
The area of the discriminator is the intersection of the 
MBB and the orthogonal MBB. We discard the side 
for which the discriminator shows the least increase 
in area if the side is eliminated. 



Figure 8 shows an example for the elimination of 
sides. The minimum bounding box and orthogonal 
minimum bounding box depicted describe an object 
or a group of objects. In part (a) of the figure, all sides 
of both boxes are stored and their information de- 
scribes the area shaded grey. (Stored sides are shown 
as solid lines). We can see that the lower and up- 
per bounds of x do not add any extra information, as 
these bounds are given by the MOBB already. If we 
eliminate them, as shown in part (b), we still describe 
the same shaded area. Furthermore, we can observe 
that the lower and upper bounds of y add only little 
information about the object. Eliminating these from 
the discriminator results in an area which is slightly 
larger, but still a tight approximation of the object. 

4.4 Reconstructing the O-tree representation 

For search or dynamic insertion, we reconstruct the 
complete O-tree discriminators. As opposed to the 
partial R-tree, during the conversion from the com- 
plete O-tree representation to the partial O-tree rep- 
resentation, information may be lost. We only store 
an approximation of the original O-tree discrimin- 
ator, therefore the partial O-tree representation will 
be less accurate than the complete O-tree representa- 
tion, while still being more descriptive than the R-tree 
information. 

Algorithm OToComplete converts a partial O-tree 
node back into a complete O-tree node, i.e., an entry 
with eight sides. Although the entry is complete, it 
is only an approximation of the original complete O- 
tree representation. As for the partial R-tree, we start 
with copying the parent discriminator. We then re- 
place the values of the sides stored with their actual 
values and Tighten tighten the discriminator. The 
resulting discriminator is an approximation of the ori- 
ginal O-tree entry. 



OToPartial(t,pd) 
s.p :— t.p 
s.n := t.n 
s.l := 0 

for t := 1 to t.n 
for j := 0 to 7 
s[t].6[7'] := true 

if (pd\j} = t\i).d[j]) 
s[i].b\j] := false 
while ()s[i].b\ > 4) 

a[i].b := EliminateSide(pd, t[t].d, s[i].b) 
for j := 0 to 7 
if (s[*].6M) 
s.s[s.l] := t\i).d\j] 
s.l := + 1 
return s 
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(a) (b) (c) 

Figure 8: Eliminating sides of an O-tree discriminator 



OToComplete(s, pd) 
t.p := s.p 
t.n :— s.n 
l:=0 

for i := 1 to s.n 
d:= pd 
for j := 0 to 7 

d\j]:=s.s[l] 
/:=/ + ! 
t[i].d := Tighten(d) 
return t 




Figure 9: Tightening an O-tree discriminator 

Although we do not store all information about the 
O-tree discriminator, we may be able to reconstruct 
some more information from the sides we have stored. 
The MBB on axes x and y also gives bounds on axes 
v and w and vice versa. We can thus Tighten these 
bounds. Figure 9 illustrates tightening on a MBB for 
x and y and MBB for v and to (MOBB). The MBB for 
x and y determines a tighter lower bound for v and 
w } while the MBB for v and w determines a tighter 
upper bound for y. The tighter bounds are illustrated 
by dashed lines. 

In the function Tighten, boundaries of x and y 
based on the values of v and w of the orthogonal box 
are computed. The original values are replaced if the 
computed bounds are tighter. The same process is 
then performed for the orthogonal MBB. 



Tighten (d) 


d[0] := 


max(d[0],(d[4]-d[7])/v/2) 


d[l] := 


m\ n (d[l},{d[S\-d[6))/V2) 


d[2] := 


max(d[2],(d[4] + d[6])/N/2) 


d[3] := 


min(d[3],(d[5] + d(7])/v^) 


d[4] := 


max(d[4],(d[0\ + d[2})/,/2) 


d[5] := 


min(d[5],(d[l]+d(3])/N/2) 


d[6] := 


max(d[6],(d[2]-d[l])/v/2) 


d[7) := 


min(d[7),(d[3]-d[0])/y/2) 


return d 





4.5 Insertion in dynamic partial O-trees 

The insertion of entries in a partial O-tree is identical 
to the insertion for partial R-trees, replacing calls 
to ToPartial by OToPartial, etc. For both trees, the 
new representation of a node is based on the previ- 
ous representation stored in the tree. In the partial 
R-tree, this representation is a complete and accur- 
ate representation of the entry. In the partial O-tree, 
this representation is already an approximation of an 
entry. Creating a new partial O-tree representation 
will therefore produce another approximation of an 
already approximated entry. The quality of the rep- 
resentation of an entry therefore deteriorates every 
time a representation has to be changed and is re- 
calculated. During dynamic insertion this happens 
very frequently. The accuracy of object description in 
the partial dynamic O-tree is therefore very poor and 
leads to inaccurate search involving high cost. Par- 
tial dynamic O-trees therefore seem to be not useful 
in practice, although the very first approximation is 
a very accurate description of entries which provides 
more information than the R-tree bounding box. 

We therefore suggest using the partial O-tree ap- 
proach in a static environment. The partial static 
O-tree is generated from an existing O-tree and every 
discriminator in the tree is converted only once into 
the partial O-tree representation. The result is a par- 
tial O-tree which takes up only about half the space 
of the original O-tree but represents the entries in a 
more descriptive way than the R-tree. 

4.6 Building a static partial O-tree 

A static partial O-tree is created from a complete O- 
tree. We can determine the fanout of a partial O-tree 
for a given node size. A complete O-tree with that 
fanout is created. The nodes in the complete O-trees 
are about twice as large as the ones in the partial O- 
tree. We then read the complete O-tree and convert 
every discriminator into the partial representation by 
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applying the StoreSides algorithm. The result is a 
partial O-tree with nodes that fit in the given node 
size and with the fanout of the complete (larger) O- 
tree. As we only had to apply the approximation step 
once, the discriminators in the tree are very accurate. 

4.7 Searching partial O-trees 

We can search partial O-trees in two different ways 
which are different in cost and accuracy. The Ac- 
curateSearch shown below reconstructs the complete 
O-tree node using OToComplete before testing inter- 
section. 



AccurateSearch(s,pd, q) 
t := OToComplete(s,pd) 
for t := 1 to s.n 
if (intersected, g)) 
case (s) 

external: Output(s.p -h i) 

internal: AccurateSearch(s.p + i, t[t].d, q) 



The alternative is a faster, but less accurate search 
algorithm similar to the partial R-tree search. The 
entries are not completely reconstructed, only the 
stored sides are compared with the query. This re- 
duces the search time as no copying of the parent 
node, copying of entry sides and tightening have to be 
performed. FastSearch is exactly analogous to Search. 



FastSearch(s, e) 
/ :=0 

for i := 1 to s.n 
match :— true 
for j :- 0 to 7 
if(sf*].6[j]) 
if (dirn(j/) x s.s[/++] > 
dirn(i) x q.d[opp(j)])) 
match := false; break 
if (match) 
case (s) 

external: Output(s.p + 1) 
internal: FastSearch (5. p + t, q) 



Both search algorithms only implement the filter 
step which creates a set of candidate objects whose 
discriminator may intersect the query discriminator. 
The refinement step takes each candidate object in a 
subsequent step and checks its intersection with the 
query object. 

5 Experiments 

5.1 Description of Experiments 

Our experiments were conducted on a Sun Ultra-5 
with 270 MHz and 256 MByte RAM under Solaris 
2.6. A level-2 cache line on our architecture is 64 
bytes. We implemented the partial R-tree and par- 
tial O-tree using the CSB+-technique as described in 
Sections 3 and 4. We compared the partial R-trees to 
a normal R-tree which also uses the pointer elimin- 
ation technique. For a 64 byte cacheline, traditional 
O-trees cannot be used as the node can only contain 
2 entries. We therefore do not include results of the 
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query 



normal O-tree in our graphs and discussion. For all 
trees, we used the quadratic splitting algorithm. 

Our test data consists of a set of randomly con- 
structed line and polygon data relations. Each line 
data set contains a number of lines each with approx- 
imate length 20 in a square of area 5000. The polygon 
data sets contain convex polygons with up to 10 nodes 
and edges of length approximately 40 in a square area 
of 10000. The polygons are constructed by randomly 
creating the 10 points and using the Graham scan al- 
gorithm to calculate their convex hull. All test files 
fit completely into main memory. 

Our experiments measure the performance for par- 
tial R-trees and partial static O-trees and compare 
them with the search performance of the R-tree. For 
each test case, we queried each tree with 10,000 ran- 
dom queries and measured the number of node ac- 
cesses, search time, number of results and search 
times including the refinement step. 

5.2 Experimental Results 

Figures 10 and 11 show the average number of node 
accesses per query for line data and polygon data, 
respectively. Node accesses can be used as a rough 
measure for cache misses, i.e., it is the worst case 
number of cache misses. For the line data, the partial 
R-trees show a reduction of 30 per cent in the number 
of node accesses compared to the R-tree. The accur- 
ate partial O-tree reduces the number of node accesses 
by 60 per cent. The fast search on the partial O-tree 
still has 55 per cent less node accesses than the R- 
tree. For polygon data, the partial R-tree shows a 
similar improvement of 25 per cent. The partial O- 
trees, on the other hand, only reduce the number of 
node accesses by a marginal 1 per cent. 

Measuring the search time per query for line and 
polygon data, we obtained the results shown in Fig- 
ures 12 and 13. For line data, the partial R-tree im- 
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Figure 12: Average search time for line data 
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Figure 13: Average search time for polygon data 



proves on the normal R-tree by about 25 per cent. 
The partial O-tree with accurate search shows how 
expensive the accurate search is: although it could 
decrease the number of node accesses significantly, 
the gained time is used to perform expensive opera- 
tions. The search time of the partial accurate O-tree 
is 60 per cent higher than for the R-tree. The less 
accurate fast O-tree search algorithm, on the other 
hand, reduces the search time by up to 35 per cent. 
This shows clearly that, although the fast search is 
slightly less accurate and will search more subtrees, 
its faster execution time results in the best perform- 
ance. For polygon data, the partial R-tree reduces 
search time by up to 25 per cent compared to the 
normal R-tree. The partial O-trees do not improve 
on the R-tree. Again, the accurate O-tree search suf- 
fers from the high computation cost and shows an 
increase of 250 per cent. Even the fast O-tree search 
shows a slight increase in search time. While show- 
ing a great improvement for line data, the additional 
orthogonal bounding box of O-trees does not seem to 
help to make search on polygon data more efficient. 

Nevertheless, as shown in Figures 14 and 15, it 
still reduces the number of hits in the filter step sig- 
nificantly. We have included the number of results 
for a complete O-tree with the same fanout to have a 
measure of the best possible reduction in results. 

Compared to the R-tree and partial R-tree, both 
of which have the same number of hits, the complete 
O-tree reduces the number of results by 65 per cent 
for line data and 25 per cent for polygon data. The 
accurate search partial O-tree still achieves a 64 per 
cent reduction for line data and 23 per cent for poly- 
gon data. The fast O-tree search is slightly less ac- 
curate, but can still show a reduction of 64 per cent 
for line data and 20 per cent for polygon data. 

We can now compare the search time of the search 
trees if we take the refinement step into account. The 
refinement step takes the set of results obtained by 
searching the tree in the filter step and tests for ac- 
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Figure 15: Total number of results for polygon data 
(in thousands) 



tual intersection of the query object and the object 
in the candidate result set. The search times for this 
experiment are shown in Figures 16 and 17. 
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Figure 16: Average search time for line data including 
refinement time 



As the number of results does not change, the im- 
provement of the partial R-tree compared to the nor- 
mal R-tree is similar for search including the refine- 
ment step. For the O-trees, on the other hand, the 
relative performance to the R-tree has changed. The 
accurate search O-tree now shows a slight improve- 
ment over the R-tree for the large line data file and 
similar performance for the other line data sets. The 
fast search O-tree now shows a reduction in search 
time of up to 60 per cent, again for the larger files. 
For polygon data, even the reduced number of candid- 
ates for the refinement step does make the partial O- 
tree competitive. For the accurate search, the search 
time now is 60 higher than for the R-tree and still no 
improvement is shown for the fast O-tree search on 
polygon data. 
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Figure 17: Average search time for polygon data in- 
cluding refinement time 



6 Summary 

We investigated how to make better use of the space 
in a small R-tree node for in-memory applications. As 
many entries share sides with their parents, we intro- 
duced the partial R-tree which only stores informa- 
tion that is not given by the parent node. Our exper- 
iments showed that the partial R-tree shows better 
performance than the R-tree for random line queries 
on line and polygon data. The improvements range 
from 10 to 30 per cent. This is due to a higher fan- 
out of the node which was four, compared to three in 
the normal R-tree. We also investigated if we could 
make better use of the space by storing different in- 
formation that promises to yield a better approxim- 
ation of the entry. The partial O-tree is based on 
the O-tree but stores only the most important part of 
the information of an O-tree box. We implemented 
a static version of the partial O-tree and investigated 
two search algorithms. The fast search algorithm still 
shows enough accuracy and showed improvements of 
35 per cent for line data without the refinement step 
and 60 per cent improvement for line data with the 
refinement step. For polygon data, search could not 
be improved, but the static partial O-tree still showed 
stable performance similar to the R-tree. 
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7 Future Work 

We will investigate if we can further improve the fan- 
out of the tree by storing the bitfields encoded using 
the Huffman-encoding. Furthermore, we will also ex- 
periment with static partial O-trees in a disk-based 
database environment. 
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